Genetic pattern and demographic history of cutlassfish (Trichiurus nanhaiensis) in South China Sea by the influence of Pleistocene climatic oscillations

Trichiurus nanhaiensis is one of the most important commercial fish species in the South China Sea. This study aimed to investigate the level of genetic variation and population genetic structure of T. nanhaiensis in the South China Sea for the first time, using 281 individuals collected from seven locations along the coast of mainland China, Taiwan, and Hainan Island. A high level of haplotype diversity and low nucleotide diversity were detected in the mitochondrial DNA cyt b gene and nuDNA RYR 3 gene. The overall expected heterozygosity (He = 0.693) among the seven populations ranged from 0.681 to 0.706 in microsatellite DNA data, which revealed high levels of genetic diversity. Significant genetic differentiation was found in Taidong populations in Taiwan, revealing the prevention of gene flow caused by the Kuroshio Current. Two major lineages based on the cyt b gene suggested that the Taiwan Strait acted as a geographic barrier for T. nanhaiensis during the glacier periods in the late Pleistocene. The Bayesian skyline plot also revealed that population demographic expansion of T. nanhaiensis was estimated to have occurred in 0.1 Mya. Our results indicated that all populations of T. nanhaiensis had experienced a recent genetic bottleneck following recent expansion based on ABC analysis.


Results
Mitochondrial and nuclear DNA. A total of 181 and 57 haplotypes were identified by sequencing 1141 bp and 882 bp of the complete mtDNA cyt b gene and nuclear RYR 3 gene from 281 T. nanhaiensis individuals, respectively (Table 1, Fig. 1). The most common haplotypes of the mtDNA cyt b gene (Hm8, Hm18, and Hm27) (Supplementary Table S1) and nuclear RYR 3 gene (Hn1, Hn2, Hn6, and Hn8) were shared by specimens from seven and all populations, respectively (Supplementary Table S2). The average haplotype diversity of the mtDNA cyt b gene was high (0.989), ranging from 0.981 (ST) to 0.995 (ZJ), and the average nucleotide diversity (θ π ) was www.nature.com/scientificreports/ low (0.005), ranging from 0.004 (ZJ) to 0.005 (YJ) ( Table 1). The average haplotype diversity of the nuDNA RYR 3 gene was high (0.724), ranging from 0.650 (SZ) to 0.812 (SY), and the average nucleotide diversity (θ π ) was low (0.001), ranging from 0.001 (SZ) to 0.002 (SY) ( Table 1). The pairwise F ST values ranged from -0.011 (between SY and YJ) to 0.084 (between TD and SZ), with a mean value of 0.023 in the mtDNA cyt b gene. The pairwise F ST values between the populations from TD and other populations were differentiated, except for the YJ population in the mtDNA cyt b gene (Supplementary Table S3). The pairwise F ST values ranged from 0.001 (between ST and BH) to 0.060 (between YJ and ZJ), with a mean value of 0.020 in the nuDNA RYR 3 gene. No significant correlation was found between genetic distance and geographic distance by using IBD analyses (r = 0.438, p = 0.087).
Hierarchical analyses of molecular variance (AMOVA) from the probable factors shaping genetic structure demonstrated that most of the genetic variation was among individuals within populations in the mtDNA cyt b gene and nuclear RYR 3 gene, i.e., one group (97.79%, 98.42%), two groups (Scenario I, 94.47%, 97.46%), two groups (Scenario II, 99.00%, 98.74%), and three groups (Scenario III, 97.30%, 98.32%) ( Table 2). However, only 2.21%, 4.19%, 1.87%, and 0.97% of the total variation was found among the groups in one group, two groups (Scenario I), two groups (Scenario II), and three groups (Scenario III) in the mtDNA cyt b gene, respectively ( Table 2). Phylogenetic trees and haplotype networks based on the cyt b gene support the formation of two major lineages, and genetic clusters do not correspond to geographical sampling sites (Figs. 2, 3). However, the phylogenetic relationship and networks based on the nuDNA fragments showed no evidence of significant geographical structure corresponding to the sampling locations (Supplementary Figs. S1, S2). Neutrality tests, mismatch distribution analysis, and Bayesian skyline plots were used to analyze signatures of historical demographic events. The significant negative values of Tajima's D and Fu's Fs tests demonstrated deviation from the neutrality test, suggesting past population expansion for T. nanhaiensis (Tajima's D, − 2.221, P < 0.001 and Fu's Fs, − 358.690, P < 0.000). Likewise, the same results were supported by the presence of a star-like topology in the cyt b gene tree and by the results of mismatch distribution analysis, which showed unimodal distributions ( Supplementary Fig. S3). The Bayesian skyline plot showed that T. nanhaiensis populations seem to have experienced demographic expansion starting approximately 100,000 years ago and a flat population history during the Pleistocene (Fig. 4). The current (θω) and historical (θπ) genetic diversity was 0.017 and 0.0046, respectively. When the current genetic diversity was larger than the historical genetic diversity, the T. nanhaiensis populations revealed a pattern of decline 22 .
Microsatellite DNA. No indication of genotyping errors due to stutter bands, large allele dropout, or null alleles was revealed using a microchecker 23   Each color in the pie charts represents the frequencies of the private (orange) and shared (yellow) haplotypes in each population. CCC China Coastal Current. The sampling site/population code is described in Table 1 Table S5). The results showed a nonsignificant correlation within IBD analyses between genetic differentiation and geographic distance using microsatellite DNA data (r = 0.618, p = 0.031). AMOVA showed that most of the genetic diversity was within individuals, i.e., one group (86.42%), two groups (Scenario I, 85.92%), two groups (Scenario II, 86.39%), and three groups (Scenario III, 86.26%) ( Table 2). When the populations were divided into two groups (Scenario I), two groups (Scenario II), and three groups (Scenario III) according to the geographical barriers, only 0.73%, 0.06%, and 0.37% of the total variation was found among the groups, respectively. We used the Bayesian model-based clustering algorithm implemented in STRU CTU RE 2.3.3 software to assign individuals into populations by estimated individual admixture proportions and to explore different numbers of populations K to the population structure based on microsatellite data. The most likely number of clusters was determined by Structure Harvester, which indicated that all seven populations were separated into two distinct genetic clusters (K = 2, Ln P(K) = − 172.360; Stdev LnP(K) = 27.976; Delta K = 6.036) (Supplementary Table S6). Although the results of the STRU CTU RE analysis supported K = 2, all sampled individuals exhibited admixture among two clusters with roughly equal contributions (Fig. 5). Microsatellite DNA data were used to provide a platform for data visualization and construction of a heatmap representing the allele frequency for all 13 loci in the 7 populations, as presented in Fig. 6. The interpretation of the color intensity was as follows: the relationship between the color and the allele frequency is given by the color bar on the right. The firebrick color indicates the highest value of allele frequency. The heatmap of allele distribution and frequency across  . 8II). The prior distribution of parameters corresponding to the five scenarios tested (see Fig. 8III). This scenario indicated that T. nanhaiensis experienced a reduction in the effective population size in the past, followed by a single instantaneous increase in population size.   The values of the average number of alleles per population (9.0) of T. nanhaiensis were relatively lower than the average number of alleles per population of the relative species L. savala and other marine fish populations along the coast of mainland China (e.g., N. albiflora N a = 12.020 31 ; Lateolabrax maculatus N a = 28.727 32 ; Larimichthys polyactis N a = 24.950 33 ). In this study, the average value of expected heterozygosity (0.693) of T. nanhaiensis was relatively lower than the average value of expected heterozygosity of marine fish (average H E = 0.790) with many marine fish populations 34 . In this study, the positive inbreeding coefficient (F IS = 0.145) of T. nanhaiensis indicates heterozygosity deficiency in all populations, which might be attributed to inbreeding. Previous studies have reported that low observed heterozygosity points to a deficiency of heterozygotes and high levels of inbreeding in the waters off mainland China (e.g., Larimichthys polyactis 33 ; Collichtys lucidus 35 ; Lateolabrax maculatus 32 ), as well as in other related species (e.g., L. savala 10 ). In addition, the loss of genetic diversity may reflect past historical events associated with current evolutionary forces, such as fishing pressure or habitat loss 19 .  Table 1) delimited by a black solid vertical line is reported on the x-axis.

Phylogeography and population structure of Trichiurus nanhaiensis.
Many previous studies suggested that the major drop in sea level in the Taiwan Strait acted as a biogeographic barrier during major falls in sea level in the Pleistocene and might have cut off migration on either side of the strait and presented two lineages in marine fishes, such as the genera Helice 36 , B. sinensis 9 , and L. savala 10 . Our results revealed that haplotype networks and phylogenetic trees support the formation of two major lineages based on the cyt b gene, and genetic clusters do not correspond to geographical groups. During the Pleistocene, the East China Sea and the South China Sea were semienclosed marginal seas and separated by the Taiwan Strait. We suggested that the Taiwan Strait acted as a geographic barrier for T. nanhaiensis during the glacier periods in the late Pleistocene and gene flow due to the absence of physical barriers during glacial retreat, while more recent secondary contacts could have played a major role 37 . The results of the AMOVA test based on mitochondrial, nuclear, and microsatellite DNA data indicated that there were no significant differences at all hierarchical levels. All genetic markers, including mitochondrial, nuclear, and microsatellite markers, congruently showed a similar genetic pattern, revealing overall low genetic differentiation among T. nanhaiensis populations, by previous studies on other marine fishes among the geographical stocks in the South China Sea (e.g., L. savala 10 ; Nemipterus bathybius 38 ). The results from STRU CTU RE indicated the presence of two clusters after independent clustering of all populations based on microsatellite DNA data, which may be explained by the high dispersal capacity of T. nanhaiensis. The scenario of evolutionary diversification of lineages followed by postglacial expansion facilitating secondary contacts was observed in previous studies (e.g., S. macrorhynchos 39 ; Scatophagus argus 28 ; L. savala 10 ). A previous study in Trichiurus species suggests a genetic well mixture and a lack of distribution barrier for T. nanhaiensis 17 .
Ocean currents, which change with the monsoon and are highly variable in the SCS, play an important role in transporting eggs, larvae, or juveniles of T. nanhaiensis 40,41 . We suggested that the high dispersal ability of marine fish may be appeared by gene flow following divergence, indicating that divergent mitochondrial lineages arose within a panmictic population. The results from DAPC and the heatmap of allele distribution and frequency across populations revealed that population TD was distinctly separated from the other geographic populations. Population TD was located in southern Taiwan, and samples were collected from the Pacific Ocean. We suggest that population TD belonging to peripheral populations may always have low genetic variability due to isolation and founder effects, but the differentiation of the populations is high owing to genetic drift and reduced gene flow under the influence of the Kuroshio Currents 42 .  Table 1 for the sampling sites/population codes.

Materials and methods
Sample collection, DNA extraction, and microsatellite genotyping. A total of 281 T. nanhaiensis were collected from seven locations in the South China Sea in 2019 (Table 1, Fig. 1). All studies in animals were conducted by ethical committee guidelines and approved by the Animal Research and Ethics Committee of College of Fisheries, Guangdong Ocean University (China), and the study was carried out in compliance with the ARRIVE guidelines. Fin tissues were stored in 95% alcohol for DNA extraction after morphological identification of all samples. Total DNA was extracted using a DNA extraction kit (Sangon Biotech, Shanghai, China). The mitochondrial cytochrome b (Cyt b) gene was amplified using the gsCY4-F (5′-TCC CTC ATT GAC CTT CCA -3′)/gsCY4-R (5′-GGT TGC GGT TCA GTT GAG -3′) primer pair, and the nuclear DNA gene (its proteincoding sequence) for ryanodine receptor 3 (RyR3) was amplified using the primers RYR3-F22 (5′-TCG GTA   18 . Pairwise F ST between all pairs of populations and analysis of molecular variance (AMOVA) were executed in Arlequin 3.5 48 . Seven populations were grouped with three scenarios based on geographical barriers: (1) Scenario I: two independent groups divided by the Taiwan Strait (TD; ST, SZ, YJ, ZJ, BH, SY); (2) Scenario II: two independent groups divided by the Qiongzhou Strait (BH; ST, SZ, YJ, ZJ, SY, TD); (3) Scenario III: three independent groups that included the mainland group, the Taiwan island group and the Hainan island group (ST, SZ, YJ, ZJ; TD; SY, BH). In addition, an isolation-by-distance Mantel test was performed to examine correlations between genetic differentiation and geographic distance by IBD 1.52 for Windows 53 . The Google Earth database (http:// earth. google. com) was used to measure the geographic distance in kilometres from GPS locations between sampling sites.
Microsatellite DNA analysis. All loci were tested for possible genotyping errors, the presence of null alleles, and allelic dropout using MICRO-CHECKER 2.2.3 software 23 . In this study, Arlequin 3.5 48 was used to calculate the number of alleles (N a ), mean observed heterozygosity (H O ), mean expected heterozygosity (H E ), and deviation from Hardy-Weinberg equilibrium (HWE) for each population. Allelic richness (A R ) and inbreeding coefficient (F IS ) 54 were estimated using FSTAT version 2.9.3 55 . Population differentiation of T. nanhaiensis was studied by estimation of the F ST and R ST using Arlequin 48 and AMOVA procedures as mitochondrial DNA data (mentioned above).
STRU CTU RE v.2.2.3 was used to delineate the optimum number of homogenous groups of sampled individuals (K) 56 . A series of 10 independent runs for each K (ranging from 2 to 5) was performed using 100,000 iterations of the Markov chain Monte Carlo (MCMC) with a burn-in period of 20,000 chains. Structure Harvester Web 0.6.94 was used to determine that the optimum number of genetic clusters within the data were predicted using the delta K method and Evanno method 57 . GenAIEx version 6.503 was used to calculate allelic frequency 58 , and the R package "pheatmap" was used for data visualization and allele frequency heatmap construction. Moreover, to investigate the genetic relationship between populations, discriminant analysis of principal components (DAPC) 59 was performed using the '"adegenet 2.0.1"' package in R 3.2.2 software 60 .

Approximate Bayesian computation (ABC) analysis. The approximate Bayesian computation (ABC)
method is incorporated into different historical and demographic evolutionary models that can be suited to complex problems that arise in scenarios in population genetics. To explore the possible historical demography of T. nanhaiensis, we used a coalescence analysis implemented in the program DIYABC v.2.0.4 61 to obtain relevant and detailed information on the population history. We separately tested the five scenarios of possible demographic changes for T. nanhaiensis based on Cyt b and RyR3 sequences and microsatellite data (Fig. 8). The HKY was the best evolutionary model for our sequence dataset. The following scenarios were used: In scenario 1 (CON model), assumed constant population size of T. nanhaiensis; In scenario 2 (DEC model), assumed populations of T. nanhaiensis decline and had experienced a bottleneck event; In scenario 3 (INC model), assumed populations of T. nanhaiensis expansion recently; In scenario 4 (INCDEC model), assumed populations of T. nanhaiensis expanded followed by a single instantaneous decrease in population size; and in scenario 5 (DECINC model), assumed populations of T. nanhaiensis bottleneck followed by re-expansion. To obtain robust results, a total of 3,000,000 simulations were performed as recommended by DIYABC for all five demographic scenarios (Fig. 8). We calculated all the summary statistics included in the software DIYABC. The intervals of the prior